ABSTRACT


The ideas of a stochastic process of clustering came to the author's attention
from Dr. Geoffrey Beall, an entomologist interested in the distribution of larvae
over an experimental field. Larvae are born from eggs deposited by moths, not
singly, but in "egg-masses." After hatching, larvae begin to crawl in search of
food. Later, a "general census" of larvae is performed. The r.v. of interest is
X = no. of larvae counted in a unit area plot in the field. Conceptual elements:
cluster centers (= egg-masses), cluster size (= no. of larvae from a single
egg-mass), dispersal of cluster members. Over the four decades since the publication
of the theory relating to larvae, essentially the same mechanism of clustering
was found to underlie many diverse natural phenomena: clustering of galaxies,
population dynamics, epidemics and effects of irradiation of living cells.
CLUSTERING:

REMINISCENCES OF SOME EPISODES
IN MY RESEARCH ACTIVITY*

Jerzy  Neyman 
Statistical  Laboratory 
University of California, Berkeley 94720


1. Introduction. It is a pleasure to be able to deliver
this Second Pfizer Annual Colloquium. In selecting its subject
I thought of work in our Berkeley Stat. Lab. relating to pharmacology.
However, as of now, this work is not sufficiently advanced
to be reported on this important occasion.

The subject I selected covers a long series of interconnected
studies in several substantive domains, all of them reflecting
the inspiration I received in the late 1930's from
Dr. Geoffrey Beall, then of the Dominion Entomological Experimental
Station, Chatham, Ontario. Regretfully, I have no
references to Dr. Beall's publications.

It happened that, while Dr. Beall's preoccupation was with
a special kind of entomological experiment, the idea for which
I feel indebted to him, that of the phenomenon of clustering,
proved to be very relevant in the following diverse domains:
(i) in the study of spatial distributions of galaxies, (ii) in
population dynamics, (iii) in the theory of epidemics, and
(iv) in the study of radiation carcinogenesis.

*Approximate text of the presentation at The Pfizer Colloquium at the
Department of Statistics, The University of Connecticut, April 30, 1979.

While  this  presentation  is  intended  to  reflect  my 
personal  experiences,  it  may  also  be  considered  as  a contribution 
to  the  history  of  a concept,  the  concept  of  "clustering."  This 
historical sketch covers four decades. The unavoidable consequence
is that most of the developments described are "symbolized"
rather  than  studied  in  depth.  Still,  my  hope  is  that  the  evolution 
of  the  original  simple  concept  will  be  found  intelligible. 

The  plan  of  the  present  paper  is  as  follows.  Section  2 
outlines  the  original  problem  of  Dr.  Beall,  concerned  with  counts 
of  larvae  in  plots  of  an  experimental  field.  The  corresponding 
stochastic  process  may  be  labeled  that  of  a "single  clustering." 

Section 3 is also concerned with the "single clustering"
process.  However,  the  substantive  domain  is  very  different: 
distribution  of  galaxies  in  space. 

The subjects of all the subsequent sections are concerned
with  sequences  of  consecutive  clusterings,  that  is,  of  the  process 
of  clustering  of  clusters.  The  natural  phenomena  studied  include 
population  dynamics  (Section  4),  radiation  carcinogenesis  (Section 
5),  and  theory  of  epidemics,  first  "outdated"  (Section  6)  and 
later "modernized" (Section 7). Section 8: Concluding remarks.

2. Single Clustering: Counts of Larvae in Plots of an
Experimental Field. Consider a large, reasonably uniform, experimental
field divided into a number of unit area plots.
Consider one of these plots, which it will be convenient to
describe as "target." At a particular period during the
summer, moths are flying over the field and, from time to time,
deposit their eggs. These eggs are not deposited singly but
in "masses," each composed of a large number of eggs. In due
course, these eggs produce larvae which begin to crawl in search
of food. After a certain period of time, when larvae are somewhat
larger and convenient to count, a general census is performed
and our interest is concerned with the number, say X,
of larvae counted in the target plot.

The concept of clustering is connected with the fact that
the larvae cannot travel fast. The possibility of one of them
being found in the target plot depends on the distance of the
egg-mass (= "cluster center") from which the given larva emerged.
Thus, the conceptual counterpart of the empirical phenomenon
relating to a single "target" plot must involve a bigger area,
the "area of accessibility" surrounding the target.

The whole mechanism of clustering involves, then, the
following concepts: (i) the distribution of egg-masses ("cluster
centers") over the field, (ii) the number of larvae from a single
mass surviving up to the census (number of cluster members, or
"size" of the cluster), (iii) the mechanism of "dispersal" of
cluster members and the implied "area of accessibility."
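
The three conceptual elements just listed can be made concrete in a small Monte Carlo sketch. Nothing below is taken from the 1939 paper: the Poisson scatter of egg-masses, the Poisson number of survivors, the bounded uniform crawl, and all names and numerical values (`masses_per_area`, `mean_survivors`, `reach`) are assumptions chosen only for illustration.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate by Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def larvae_in_target(rng, masses_per_area=1.0, mean_survivors=3.0, reach=0.5):
    """Count the larvae found in the unit-square target plot [0, 1) x [0, 1)."""
    # Area of accessibility: the target enlarged by `reach` on every side.
    lo, hi = -reach, 1.0 + reach
    # (i) egg-masses scattered as a Poisson process over that area (assumed).
    n_masses = poisson(rng, masses_per_area * (hi - lo) ** 2)
    count = 0
    for _mass in range(n_masses):
        cx, cy = rng.uniform(lo, hi), rng.uniform(lo, hi)
        # (ii) number of larvae surviving up to the census (assumed Poisson).
        for _larva in range(poisson(rng, mean_survivors)):
            # (iii) dispersal: a crawl of at most `reach` along each axis (assumed).
            x = cx + rng.uniform(-reach, reach)
            y = cy + rng.uniform(-reach, reach)
            if 0.0 <= x < 1.0 and 0.0 <= y < 1.0:
                count += 1
    return count


def simulate_counts(n_plots, seed=0):
    """Independent replications of the census count X for one target plot."""
    rng = random.Random(seed)
    return [larvae_in_target(rng) for _ in range(n_plots)]
```

Under these assumptions the mean count equals `masses_per_area * mean_survivors`, while the variance of the counts should exceed that mean, unlike for a Poisson law; this excess is the signature of clustering.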
Naturally, the details of these three conceptual elements
must vary from one empirical domain of study to the next.
Figures 1 and 2, both representing the details of the original
publication of 1939, are intended to "symbolize" the contemporary
thinking. It will be seen that at the time when the
paper was written, in 1936 or 1937, Dr. Beall and I did not
dream that the mechanism of clustering could be relevant to
the understanding of phenomena of clustering of galaxies, etc.
The then contemplated applications were "entomology" and
"bacteriology."
Figure 1. A detail of the title page of the original paper of 1939.
Ann. Math. Stat., Vol. 10 (1939), pp. 35-57.
TABLE I

Distribution of European corn borers in 120 groups of 8 hills each
(data provided by Dr. Beall), fitted by Poisson Law and by type A Law
with two parameters

                        Frequency
No. of       -----------------------------------
borers       Exp. P.L.    Observed    Exp. T.A.

   0            5.0          24          22.6
   1           16.0          16          16.7
   2           25.3          16          18.3
   3           26.7          18          16.4
   4           21.1          15          13.4
   5           13.4           9          10.3
   6            7.1           6           7.5
   7            3.2           5           5.2
   8            1.3           3           3.5
   9             .4           4           2.3
  10             .1           3           1.5
  11                          0
  12                          1
Beyond                                    2.3

  m1            --                       2.178
  m2            --                       1.454
  P(chi^2)    .000,000                    .95

TABLE II

Distribution of yeast cells in 400 squares of haemacytometer observed
by "Student" (1907), fitted by Poisson Law and by type A Law with two
parameters

                        Frequency
No. of       -----------------------------------
cells        Exp. P.L.    Observed    Exp. T.A.

   0           202          213         214.8
   1           138          128         121.3
   2            47           37          45.7
   3            11           18          13.7
   4                          3           3.6
   5                          1            .8
Beyond           2           --            .1

  m1            --                       3.605
  m2            --                        .189
  P(chi^2)    > .02                     > .1

Figure 2. Two tables published at the end of the original
paper of 1939.
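
The "Exp. T.A." column of Table I can be recomputed, at least approximately, from the two fitted parameters. The sketch below assumes the standard textbook form of the two-parameter type A law, with probability generating function exp{m1 (exp(m2 (z - 1)) - 1)}; that formula and the recurrence derived from it are not quoted in the present text, and the function name `type_a_pmf` is hypothetical.

```python
import math


def type_a_pmf(m1, m2, n_max):
    """Probabilities P(X = 0), ..., P(X = n_max) of the type A law (assumed form)."""
    # P(X = 0) is the generating function evaluated at z = 0.
    p = [math.exp(-m1 * (1.0 - math.exp(-m2)))]
    # Differentiating the generating function gives the recurrence
    # (n + 1) P(n + 1) = m1 * m2 * exp(-m2) * sum_k (m2**k / k!) * P(n - k).
    for n in range(n_max):
        s = sum(m2 ** k / math.factorial(k) * p[n - k] for k in range(n + 1))
        p.append(m1 * m2 * math.exp(-m2) * s / (n + 1))
    return p


# Expected frequencies for the 120 groups of Table I, using its m1 and m2.
expected = [120 * q for q in type_a_pmf(2.178, 1.454, 10)]
```

With the parameters of Table I the first two expected frequencies come out close to the printed 22.6 and 16.7, which suggests that the assumed parameterization agrees with the one used for the table.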
3. Clustering of Galaxies. As is generally known, California
is the "land of big telescopes." Having lived in
Berkeley since August, 1938, it was unavoidable for me to become
exposed to statistical problems of astronomy and, more
specifically, of "extragalactic astronomy" or cosmology. In
particular, I must record the inspiring influence of two "red-blooded"
astronomers, N.U. Mayall and C.D. Shane, at the time
both at the Lick Observatory, of which Shane was the director.

The principal subject of our studies was the question whether,
by and large, the distribution of galaxies in space is clustered,
or whether, as was broadly believed, the galaxies are distributed in
space singly, perhaps approaching a Poisson process. Beginning
with 1952, there resulted a substantial sequence of publications,
frequently co-authored by astronomers. These are
exemplified by the following references:

J. Neyman and E.L. Scott, "A theory of spatial distribution
of galaxies," Astrophysical Journ., Vol. 116 (1952),
pp. 144-163.

J. Neyman, C.D. Shane and E.L. Scott, "On the spatial
distribution of galaxies: a specific model,"
Astrophysical Journ., Vol. 117 (1953), pp. 92-133.

E.L. Scott, "The brightest galaxy in a cluster as a
distance indicator," Astronomical Journ., Vol. 52
(1957), pp. 278-295.

J. Neyman, "Sur la théorie probabiliste des amas de
galaxies et la vérification de l'hypothèse de l'expansion
de l'univers," Annales de l'Institut Henri
Poincaré, Vol. 14 (1955), pp. 201-244.

J.L. Lovasich, N.U. Mayall, J. Neyman, and E.L. Scott,
"The expansion of clusters of galaxies," Proc. Fourth
Berkeley Symp. on Math. Stat. and Prob. (J. Neyman,
ed.), Univ. of Calif. Press, Berkeley and Los Angeles,
Vol. 3 (1961), pp. 187-227.
It will be realized that, compared to clustering of larvae
in a field, the cosmological aspect of the phenomenon of clustering
is immeasurably more complex. One reason is the impossibility
of approaching the cluster and counting its members! All the
astronomer can do is to look at photographs of the sky, count
on them the images of galaxies, study images of single galaxies,
and also the spectra of the light they emit. The ingenuity the
astronomers exhibit is really remarkable. Also, there is here
a most encouraging east/west intellectual cooperation. This
I had the pleasure of describing in my latest publication dealing
with cosmology. The reference is: "Reminiscences of a Revolutionary
Period in Cosmology," Problems of Physics and Evolution
of the Universe (festschrift for V.A. Ambartsumian),
L.A. Mirzoyan, ed., Publ. House of the Armenian Acad. of
Sciences, Yerevan (1978), pp. 243-249.

4. Population Dynamics: Sequence of Clustering of
Clusters, of Clusters, etc. Here the most relevant publication,
co-authored with E.L. Scott, has the title: "On a mathematical
theory of populations conceived as conglomeration of clusters."
It appeared in 1957 in Vol. XXII of Proc. Cold Spring Harbor
Symposia on Quantitative Biology, pp. 109-120.

The problem studied can be summarized as follows. Consider
an infinite plane H representing the "habitat" and let
R_1, R_2, ..., R_s be any arbitrarily selected non-overlapping
regions in H. Also, let m_1, m_2, ..., m_s be s arbitrary non-negative
integer numbers. The characterization of the distribution
of a population inhabiting H is understood to mean
a rule for determining the probability that, simultaneously,
the number of the population members in R_1 will be exactly
m_1, that the number of them located in R_2 will be exactly m_2,
etc.

In other words, if X_i stands for the number of population
members in R_i, the problem was to deduce the formula for the
probability generating function of the random variables X_1,
X_2, ..., X_s.
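
For reference, the object sought can be written out explicitly. Writing X_i for the number of population members in region R_i (i = 1, ..., s), the display below is the general definition of a joint probability generating function, not the specific formula obtained in the paper; the symbols G and t_1, ..., t_s are notation introduced here.

```latex
G(t_1, \ldots, t_s)
  = E\left[ t_1^{X_1} t_2^{X_2} \cdots t_s^{X_s} \right]
  = \sum_{m_1 \ge 0} \cdots \sum_{m_s \ge 0}
    P\{ X_1 = m_1, \ldots, X_s = m_s \}\,
    t_1^{m_1} \cdots t_s^{m_s},
  \qquad |t_i| \le 1 .
```

The coefficient of t_1^{m_1} ... t_s^{m_s} is exactly the simultaneous probability described in the text, so knowing G amounts to knowing the whole rule.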

The  mathematical  assumptions  used  in  the  work  are  reducible 
to  a repetition  of  the  process  of  single  clustering.  A litter, 
having  one  or  more  members,  is  born  at  a point  (=  cluster  center) 
in  H.  The  members  of  the  litter  (cluster)  disperse  and  gradually 
die  out.  Before  dying,  some  of  the  litter  members  produce  their 
own  litters  of  progeny,  etc.  The  particular  object  of  study  is 
the joint distribution of two successive generations of the
population,  the  paternal  and  the  filial.  Also,  some  asymptotic 
results  are  obtained. 
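
A repetition of single clustering over two generations might be simulated along the following lines. This is only a sketch under loudly stated assumptions: cluster centers fall in a finite square window rather than on an infinite plane, litter sizes are one plus a Poisson number, dispersal is Gaussian, and each paternal member breeds with a fixed probability `p_breed`. None of these choices, names, or values comes from the 1957 paper.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate by Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def litter(rng, center, extra_mean, spread):
    """One litter of at least one member, each displaced from the cluster center."""
    size = 1 + poisson(rng, extra_mean)
    return [(rng.gauss(center[0], spread), rng.gauss(center[1], spread))
            for _ in range(size)]


def two_generations(rng, centers_mean=5.0, side=10.0, extra_mean=2.0,
                    spread=0.5, p_breed=0.5):
    """Return the locations of the paternal and of the filial generation."""
    paternal, filial = [], []
    for _ in range(poisson(rng, centers_mean)):
        center = (rng.uniform(0.0, side), rng.uniform(0.0, side))
        paternal.extend(litter(rng, center, extra_mean, spread))
    # Each paternal member becomes, with probability p_breed, a new cluster center.
    for parent in paternal:
        if rng.random() < p_breed:
            filial.extend(litter(rng, parent, extra_mean, spread))
    return paternal, filial


def count_in(points, region):
    """Number of points in the rectangle region = (x0, x1, y0, y1)."""
    x0, x1, y0, y1 = region
    return sum(1 for x, y in points if x0 <= x < x1 and y0 <= y < y1)
```

Calling `two_generations(random.Random(0))` and applying `count_in` to both lists for the same region gives one joint observation of the paternal and filial counts; repeating this many times yields an empirical joint distribution.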

While  the  distribution  of  a species  over  the  habitat 
appears  as  a domain  very  different  from  that  of  the  distribution 
of  galaxies  in  space,  there  are  some  important  analogies.  In 
either case no definitive empirical verification of the hypothetical
details is possible. All one can do is to perform a
Monte  Carlo  simulation  and  to  compare  the  results  with  such 
fragmentary  observations  as  may  be  possible  to  accumulate. 

5. Radiation Carcinogenesis. Out of the problems we studied
in our Berkeley Stat. Lab., the chance mechanisms governing
carcinogenesis, particularly the radiation carcinogenesis,
may well be the most difficult and, perhaps, the most important.
I try to describe it ahead of epidemics for the reason that, at
the present moment, there is something like a "lull" in our
efforts.

Obviously,  in  order  to  achieve  some  significant  results 
leading to the understanding of the chance mechanism, or mechanisms,
in living cells or tissues, a statistician must depend upon
interested  cooperation  of  an  experimenting  biologist.  It  happens 
that,  at  this  moment,  we  lack  the  necessary  contacts.  On  the 
other  hand,  as  will  be  described  in  Sections  6 and  7,  our  studies 
of  the  mechanisms  of  epidemics  develop  at  a reasonable  rate. 

Because  the  phenomenon  of  radiation  carcinogenesis 
appears  very  distant  from  that  of  the  distribution  of  larvae  and, 
certainly, from cosmology, one is likely to believe that the underlying
chance mechanisms could have nothing in common. Yet,
closer  examination  indicates  the  contrary.  Such  differences  as 
exist  are  differences  of  complexity. 

The  problems  studied  and  the  results  obtained  are  summarized 
in  a relatively  recent  paper  written  jointly  with  Prem  S.  Puri. 
This paper being just a summary report, the best way to "symbolize"
briefly the essence of our efforts seems to be through extensive
quotes from our paper, including pictorial illustration.

Figure 3 reproduces the title page of our article. The
article is concerned with the chance mechanism of damage to living
cells caused by irradiation. One aspect of the "damage" may be
the cell becoming cancerous. In parallel with the "damage" we
consider the mechanism of possible "repair." Figures 4 and 5
illustrate the observable phenomena that our "structural model"
is intended to explain. These phenomena include the difference
between the so-called "high" and "low" linear energy transfers (LET).

The reader will notice that the role of "egg-masses" in Dr.
Beall's studies is now played by "primary" particles of irradiation.
Experiments are possible to estimate their temporal distribution.
This contrasts with the practical impossibility of counting the
egg-masses. On the other hand, while in Dr. Beall's situation
basic observables are counts of larvae in the test plots, in the
problem of carcinogenesis the corresponding entities, namely the
"secondary particles" and their "hits" in the targets, are impossible
to count. The exception is the possibility of concluding
that the number of "unrepaired" hits is zero, etc.

One  easily  illustrated  common  element  is  the  "area"  (or 
"volume")  of  accessibility. 
Figure 3. Title page of the article "A structural model of radiation effects in living cells."
Figure 4. Incidence of myeloid leukemia in relation to dose and
dose rate of gamma radiation. One rad = 0.01 J/kg.

Figure 5. Life shortening in female mice as influenced by
dose rate of gamma rays and neutrons. Open symbols
represent gamma rays; filled symbols, neutrons.
6. Theory of Epidemics (Outdated). Our Stat. Lab's effort
at a theory of epidemics was published in 1964 in a paper co-authored
by E.L. Scott and myself. The reference is "A Stochastic
Model of Epidemics" (Stochastic Models in Medicine and Biology, J.
Gurland, ed., The University of Wisconsin Press, 1964). The paper
was inspired by the book by Norman T.J. Bailey, The Mathematical
Theory of Epidemics, that summarized quite a few earlier investigations,
beginning with that of McKendrick of 1926. One of the
basic assumptions of many of these works was that, given a population
including so many susceptibles, the appearance of a single
infectious creates a probability of contracting the disease that
is the same for each of the susceptibles. As Bailey points out,
any assumption of this kind may be realistic for a dormitory of
a boarding school but not for a city and certainly not for a
country. The stochastic model we produced is explicit in recognizing
the lack of uniformity of the habitat. For a time, the
model appeared reasonably realistic. However, during the winter
quarter of 1978, there came the awareness of an important lack
of realism. The ideas underlying an effort at modernization
are described in the next section. The basic assumptions of the
"outdated" theory are as follows.

(i) Number Infected by a Single Infectious. Consider an
infinite plane H described as the habitat. It is assumed that
to each point in H with coordinates u = (u_1, u_2) there corresponds
a random variable v(u) representing the number of susceptibles
who would be infected if at that point there was a single infectious.
While the actual distribution of v(u) is left unspecified,
it is assumed that the variables v(u) corresponding to different
points in the habitat are mutually independent. In fact, it is
assumed that the variables v(u) are independent of all other
random variables of the system and that P{v(u) = 0} < 1.

(ii) Dispersal of Infected. It was assumed that during the
"latent period" T (the same for all infected) the individuals infected
at u travel independently from each other. Furthermore,
it was assumed that to each point u in the habitat there corresponds
a function f(x|u) representing the probability density of
the location x = (x_1, x_2) where an individual infected
at u becomes infectious. Except for certain conditions of regularity,
the function f(x|u), the "dispersal function," is left unspecified.

(iii) Immigration. The term "immigrants" is used to describe
real infectious immigrants and also local inhabitants who become
infectious "spontaneously," perhaps due to mutations of bacteria
in their bodies. It is postulated that the appearance of an
infectious "immigrant" is governed by a density function λ(u)
defined over the whole habitat and subject to certain conditions
of regularity.


(iv) Discrete Generations. It was assumed that the duration
of infectiousness is zero and that occurrences of infection
all over the habitat occur simultaneously. In consequence, the
development of an epidemic is divided into discrete generations.

The relevant mathematics involves the following two concepts.
(i) The random variable v(u), the number infected by a single
infectious at a specified point u = (u_1, u_2) in the habitat, and
(ii) the random variable, say y(X|u), the number infected somewhere in
the habitat (X is a random variable) by a single individual of
the earlier generation of the epidemic who became infected at
a specified point u in the habitat.

The specification of a particular kind of epidemic, say of
polio,  depended  on  two  families  of  functions,  v(u)  (=  the  sizes 
of  "clusters"  centered  at  u)  and  the  dispersal  function  f(x|u), 
both  subject  to  certain  conditions  of  regularity. 
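
Assumptions (i), (ii) and (iv) translate fairly directly into a generation-by-generation simulation. The sketch below is not the 1964 model itself: it takes v(u) to be Poisson with a location-dependent mean, takes the dispersal function f(x|u) to be Gaussian around u, and omits immigration (iii). The function names and the example mean are hypothetical.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate by Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def example_mean_infected(u):
    """Hypothetical mean of v(u): more infections in the half-plane u_1 < 0."""
    return 1.5 if u[0] < 0.0 else 0.5


def run_epidemic(rng, start, mean_infected, spread, max_generations=10):
    """Sizes of the successive discrete generations started by one infectious."""
    current = [start]      # locations at which individuals become infectious
    sizes = [1]
    for _generation in range(max_generations):
        following = []
        for u in current:
            # (i) the number infected at u is a draw of v(u) (assumed Poisson).
            for _case in range(poisson(rng, mean_infected(u))):
                # (ii) each infected individual travels independently during the
                # latent period; f(x|u) is assumed Gaussian around u.
                following.append((rng.gauss(u[0], spread), rng.gauss(u[1], spread)))
        if not following:
            break          # the epidemic has died out
        sizes.append(len(following))
        current = following
    return sizes
```

For instance, `run_epidemic(random.Random(0), (-1.0, 0.0), example_mean_infected, 1.0)` returns the list of generation sizes for one epidemic started in the more favorable half-plane for infection; whether it dies out or grows depends on how often the infected wander into each half-plane.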

The subjects of study included the possibility that an
epidemic started by a single infectious might get "out of hand,"
as was once the case of a polio epidemic. Among the particular
cases considered there was the possibility that a region R, marked
by highly hygienic conditions, will escape the outbreak of a
substantial epidemic, while in the rest of the habitat that same
epidemic will get "out of hand." Two particular theorems, which
in private conversations were called "democracy theorems," indicated
that efforts at the establishment of such especially "healthy"
regions would be futile.


7. Theory of Epidemics (Modernized). During the winter
quarter of 1978, discussion of the theory of epidemics just
described benefited by the participation of Mrs. Florence
Morrison of the California State Department of Health. Also,
we had several other rather interested and active members of
the group, of whom I shall mention two visitors from abroad:
Dr. S. Kwesi Odoom, a Fulbright Fellow from Uganda, and Dr.
Luis R. Pericchi from Venezuela.

Mrs. Morrison's most valuable contribution was the remark
that the real habitats represented by entire countries are
much more heterogeneous than the older theory presupposed.
With reference to an epidemic of a communicable disease, Mrs.
Morrison contended (and everyone agreed) that real habitats,
such as the state of California, are stratified according to
socio-economic status of the population. This stratification
influences the development of an epidemic. At the very least
three categories of locations have to be considered, depending
on the income of the inhabitants: "high," "middle," and "low"
(say slums, which is the term I used). There was the consensus
of opinion that the number infected by a single infectious
depends not only on the region, say R_j, in which the infection
takes place, but also on the region, say R_i, where the infecting
individual lives. For example, if an inhabitant of slums suddenly
becomes infectious in the locality he inhabits, he is likely
to infect many more people around him than would the visiting
inhabitant of a high income region, etc.

The discussion that followed resulted in a somewhat unusual
"take home exam" I formulated last year, reproduced below.

The result of the exam proved quite interesting to several
participants in the discussion, including myself. While the subdivision
of California into only three different socio-economic
regions represented an obvious over-simplification, the results
obtained appear instructive and there are plans afoot to produce
a paper for publication.

8. Concluding Remarks. As mentioned at the outset, in
selecting the subject of this Colloquium presentation, I had in
mind to illustrate the phenomenon of evolution of an idea. The
idea of the mechanism of "clustering" does not represent anything
unique. Many other fruitful ideas also evolve. Otherwise,
they would hardly be considered "fruitful." Ordinarily, the process
of substantial evolution of a simple idea takes quite some
time, much longer than the time we ordinarily spend in learning
our contemporary state of that evolving idea. Thus, the phenomenon
of the evolution escapes our attention. Yet, it seems
interesting.
Mr. Neyman


TAKE  HOME  FINAL  EXAM 

Due  for  Delivery  and  discussion  on  Thursday,  March  23,  12:30-3:30  pm, 
in  Room  72  Evans. 

Your  presence  at  the  discussion  IS  NECESSARY. 


Instructions: Write clearly and tidily. Use ink or an intense
black pencil.

* * * * * * * 


Problem  1.  State  the  basic  assumptions  of  the  theory  presented  in 
lectures  and  describe  the  principal  results  (e.g.  What 
are  the  "democracy"  theorems?).  Use  your  own  words. 

Do  not  copy  from  the  published  paper. 


Problem 2. Criticize the basic assumptions of the theory even with
reference to communicable diseases spread through personal
contacts between infectious and susceptibles, like coughing.
How should the basic assumptions be modified to
make the theory more realistic?


Problem 3. Use computer facilities to simulate the development of an
epidemic in conditions slightly more realistic than described
in the early part of the course.

Consider a habitat composed of three regions R_1, R_2, and
R_3:

R_1: high income region, with sparse population and
facilities for travel.

R_2: middle income region (typified by Berkeley), with
substantially denser population and with reasonable
facilities for travel.

R_3: low income region, with very dense population and
very limited travel opportunities.

Assume that each of the three regions is uniform in all
respects and assign to them numerical values of the various
relevant parameters (e.g. of probabilities that an individual
inhabiting R_i will become infectious in R_j, etc.).
The values assigned should be consistent with your intuition,
but different from those of your colleagues (consult
with them!). Next, consider an epidemic initiated by a
single individual who became infectious in one of the
three regions. Then, use the Monte Carlo simulation technique
to generate 100 epidemics and calculate the mean
number of cases in the successive generations of the epidemic
and the mean total size of the epidemic. What about
the "democratic" theorems?


Good  Luck! 
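
One possible way to set up the simulation asked for in Problem 3 is sketched here. It is not any participant's actual solution. Every number in `TRAVEL` and `MEAN_INFECTED` is invented for illustration, the number infected is assumed Poisson, the newly infected are assumed to be residents of the region where the infection takes place, the initial case is specified by home region, and each epidemic is truncated after a fixed number of generations.

```python
import math
import random

# TRAVEL[i][j]: assumed probability that a resident of region i becomes
# infectious while in region j (index 0 = high, 1 = middle, 2 = low income).
TRAVEL = [[0.60, 0.30, 0.10],
          [0.20, 0.70, 0.10],
          [0.02, 0.08, 0.90]]

# MEAN_INFECTED[i][j]: assumed mean number infected in region j by an
# infectious individual who lives in region i.
MEAN_INFECTED = [[0.3, 0.5, 0.8],
                 [0.4, 0.8, 1.2],
                 [0.6, 1.0, 2.0]]


def poisson(rng, mean):
    """Draw a Poisson variate by Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def one_epidemic(rng, start_home, generations=6):
    """Generation sizes of one epidemic started by a resident of start_home."""
    current = [start_home]            # home regions of the currently infected
    sizes = [1]
    for _generation in range(generations):
        following = []
        for home in current:
            place = rng.choices(range(3), weights=TRAVEL[home])[0]
            n_new = poisson(rng, MEAN_INFECTED[home][place])
            following.extend([place] * n_new)   # new cases live where infected
        sizes.append(len(following))
        current = following
    return sizes


def mean_course(start_home, n_epidemics=100, generations=6, seed=0):
    """Mean cases per generation and mean (truncated) total size."""
    rng = random.Random(seed)
    totals = [0] * (generations + 1)
    for _ in range(n_epidemics):
        for g, size in enumerate(one_epidemic(rng, start_home, generations)):
            totals[g] += size
    means = [t / n_epidemics for t in totals]
    return means, sum(means)
```

Comparing `mean_course(0)`, `mean_course(1)` and `mean_course(2)` shows how the course of the epidemic depends on where it starts; whether a well-off region stays protected under such parameters is the question the exam raises about the "democracy" theorems.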